Discriminative speaker adaptation using articulatory features

Author

  • Florian Metze
Abstract

This paper presents an automatic speech recognition system using acoustic models based on both sub-phonetic units and broad phonological features such as Voiced and Round as output densities in a hidden Markov model framework. The aim of this work is to improve speech recognition performance, particularly on conversational speech, by using units other than phones as a basis for discrimination between words. We explore the idea that phones are more of a short-hand notation for a bundle of phonological features, which can also be used directly to distinguish competing word hypotheses. Acoustic models for different features are integrated with phone models using a multi-stream approach and log-linear interpolation. This paper presents a new lattice-based discriminative training algorithm using the maximum mutual information criterion to train stream weights. This algorithm allows us to learn stream weights automatically from training or adaptation data and can also be applied to other tasks. Decoding experiments conducted in comparison to a non-feature baseline system on the large-vocabulary English Spontaneous Scheduling Task show reductions in word error rate of about 20% for discriminative model adaptation based on articulatory features, slightly outperforming other adaptation algorithms.
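The abstract names two ingredients that a standard formulation can make concrete: log-linear interpolation of the phone stream with the articulatory-feature streams, and estimation of the stream weights under the maximum mutual information (MMI) criterion. The equations below are an illustrative sketch of that standard setup; the notation (stream 0 for the sub-phonetic models, streams 1,…,M for features such as Voiced and Round), the normalization term and the exact form of the objective are assumptions and are not reproduced from the paper.

\[
\log p_\lambda(x \mid s) \;=\; \sum_{i=0}^{M} \lambda_i \, \log p_i(x \mid s) \;+\; C_\lambda ,
\]

where $x$ is the acoustic observation, $s$ the HMM state, $p_i(\cdot \mid s)$ the emission density of stream $i$, $\lambda_i$ the stream weights and $C_\lambda$ a normalization term. Training or adapting the weights with the MMI criterion then amounts to maximizing, over utterances $r$ with reference transcripts $W_r$ and competing hypotheses $W$ taken from lattices,

\[
F_{\mathrm{MMI}}(\lambda) \;=\; \sum_r \log \frac{p_\lambda(X_r \mid W_r)\, P(W_r)}{\sum_W p_\lambda(X_r \mid W)\, P(W)} ,
\]

so that for speaker adaptation only the small set of weights $\lambda_i$ needs to be re-estimated while the underlying acoustic models stay fixed.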

Similar articles

On using Articulatory Features for Discriminative Speaker Adaptation

This paper presents a way to perform speaker adaptation for automatic speech recognition using the stream weights in a multi-stream setup, which includes acoustic models for “Articulatory Features” such as ROUNDED or VOICED. We present supervised speaker adaptation experiments on a spontaneous speech task and compare the above stream-based approach to conventional approaches, in which the model...

Speaker adaptation of an acoustic-articulatory inversion model using cascaded Gaussian mixture regressions

The article presents a method for adapting a GMM-based acoustic-articulatory inversion model trained on a reference speaker to another speaker. The goal is to estimate the articulatory trajectories in the geometrical space of a reference speaker from the speech audio signal of another speaker. This method is developed in the context of a system of visual biofeedback, aimed at pronunciation trai...

Using Articulatory Information for Speaker Adaptation

Articulatory Features (AF) have proven beneficial for Automatic Speech Recognition (ASR) in noisy environments, for hyper-articulated speech or in multi-lingual settings. A stream setup can combine standard sub-phone Gaussian Mixture Models with feature GMMs; the weights assigned to each feature stream such as VOICED or BILABIAL could intuitively be used for adaptation to speaker or text. In th...

Articulatory features for conversational speech recognition

While the overall performance of speech recognition systems continues to improve, they still show a dramatic increase in word error rate when tested on different speaking styles, i.e. when speakers for example want to make an important point during a meeting and change from sloppy speech to clear speech. Today’s speech recognizers are therefore not robust with respect to speaking style, althoug...

Journal:
  • Speech Communication

Volume 49, Issue -

Pages -

Publication date: 2007